Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new lm minimiser #208

Merged
merged 19 commits into from
Nov 18, 2024
Merged

Add new lm minimiser #208

merged 19 commits into from
Nov 18, 2024

Conversation

RichardWaiteSTFC
Copy link
Collaborator

@RichardWaiteSTFC RichardWaiteSTFC commented Nov 1, 2024

New features:

  • New (simpler) LM minimiser lm4 that support minimising scalars correctly and with optional bounds on parameters
  • Add support to minimise parameters using function handle returning flattened array of residuals
  • Add support to pass residual array rather than scalar in sw_fitpowder (powder fitting class)
  • Add option vary to simplex to specify which parameters to vary in the fit (provided this option in lm4 as well)

Changes to behaviour:

  • cost_function_wrapper (and minimisers which use it - e.g. simplex and lm4) will reset parameters out of bounds provided using same method as lsqnonlin (previously would reset to be nearest boundary, but this didn't work well for the LM method).
  • Also cost_function_wrapper will now throw a warning when parameters have been reset inside the bounds.

@codecov-commenter
Copy link

codecov-commenter commented Nov 1, 2024

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 92.35294% with 13 lines in your changes missing coverage. Please review.

Project coverage is 43.29%. Comparing base (73f604a) to head (1fc202a).

Files with missing lines Patch % Lines
swfiles/+ndbase/lm4.m 93.51% 7 Missing ⚠️
swfiles/sw_fitpowder.m 73.33% 4 Missing ⚠️
swfiles/+ndbase/cost_function_wrapper.m 94.11% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #208      +/-   ##
==========================================
+ Coverage   42.86%   43.29%   +0.42%     
==========================================
  Files         242      243       +1     
  Lines       16240    16383     +143     
==========================================
+ Hits         6962     7093     +131     
- Misses       9278     9290      +12     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link

github-actions bot commented Nov 1, 2024

Test Results

    4 files  ± 0    124 suites  ±0   4m 47s ⏱️ -2s
  831 tests +40    813 ✅ +40  18 💤 ±0  0 ❌ ±0 
2 324 runs  +84  2 288 ✅ +84  36 💤 ±0  0 ❌ ±0 

Results for commit 1fc202a. ± Comparison against base commit 73f604a.

This pull request removes 2 and adds 42 tests. Note that renamed tests count towards both.
sw_tests.unit_tests.unittest_ndbase_cost_function_wrapper ‑ test_init_with_fcost_both_bounds_with_fixed_param_using_ifix
sw_tests.unit_tests.unittest_sw_fitpowder ‑ test_fit_background
sw_tests.unit_tests.unittest_ndbase_cost_function_wrapper ‑ test_init_with_fcost_all_params_fixed
sw_tests.unit_tests.unittest_ndbase_cost_function_wrapper ‑ test_init_with_fcost_both_bounds_fixed_invalid_param_using_ifix
sw_tests.unit_tests.unittest_ndbase_cost_function_wrapper ‑ test_init_with_fcost_both_bounds_fixed_param_using_ifix
sw_tests.unit_tests.unittest_ndbase_cost_function_wrapper ‑ test_init_with_resid_handle
sw_tests.unit_tests.unittest_ndbase_optimisers ‑ test_optimise_data_struct(optimiser=@ndbase.lm4,poly_func=char_@_x__p__polyval_p__x_)
sw_tests.unit_tests.unittest_ndbase_optimisers ‑ test_optimise_data_struct(optimiser=@ndbase.lm4,poly_func=function_handle)
sw_tests.unit_tests.unittest_ndbase_optimisers ‑ test_optimise_data_struct(optimiser=value2,poly_func=value1)
sw_tests.unit_tests.unittest_ndbase_optimisers ‑ test_optimise_data_struct(optimiser=value2,poly_func=value2)
sw_tests.unit_tests.unittest_ndbase_optimisers ‑ test_optimise_residual_array_lm(optimiser=@ndbase.lm4)
sw_tests.unit_tests.unittest_ndbase_optimisers ‑ test_optimise_residual_array_lm(optimiser=@ndbase.simplex)
…

♻️ This comment has been updated with latest results.

In ndbase.lm4 and simplex optimisers. Also added tests
This allows support for ND fitting problems.
Copy link
Member

@mducle mducle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this. It should probably be two different PR, one for the residual return function and one for lm4 but I think it's fine since the residual stuff is small; but I'll split my comments on the two parts:

  • The residual stuff is good. I don't have a problem with using the resid_handle flag. The only other way I can think of to check it is to evaluate the model function and check if it returns a scalar or not but this could be costly just on construction to test what kind of input it is. For lm4 where we evaluate to get the starting cost value then this might be a way but I'm not sure if the simplex minimiser does this and I don't know how to implement it since the wrapper is separate from the mimisers...
  • The docstring for lm4 needs a lot of updating. I guess it was taken from another function?
  • Some of the unit tests (especially for bounded problems) are disabled for lm4 but work for simplex - is this a weakness of lm4? Should we note in the docstring that bounded fitting with lm4 is a bit flaky and maybe suggest users to specify larger bounds?

end

function test_init_with_fcost_all_params_fixed(testCase)
% note second param outside bounds
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment is wrong? It doesn't look like you've set any bounds here, just fixed the parameter values?
Looks like this was copied from test_init_with_fcost_no_bounds_with_fixed_param_using_ifix which also has the same (erroneous?) comment in line 96.

testCase.verify_val(cost_val, 0, 'abs_tol', 1e-6);
end

function test_optimise_residual_array_lm(testCase, optimiser)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test should also run for ndbase.simplex too, right? Should the _lm suffix be removed?

+sw_tests/+unit_tests/unittest_ndbase_optimisers.m Outdated Show resolved Hide resolved
testCase.verify_val(pars_fit, testCase.rosenbrock_minimum, 'abs_tol', 1e-3);
testCase.verify_val(cost_val, 0, 'abs_tol', 1e-6);
end

function test_optimise_rosen_both_bounds_minimum_not_accessible(testCase, optimiser)
function test_optimise_rosen_both_bounds_minimum_not_accessible(testCase)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in this test the bounds given are all outside the optimum parameter values of (1, 1) - in this case would we expect all minimisers to try to find the optimum value within the bounds? If so, do you know why lm4 isn't able to do this? Is it something to do with how we've implemented the bounds? (Is it because the gradient becomes too steep near the boundaries?)

(Just running it manually with these bounds and the Rosenbrock function I get optimised parameters (-0.5, -0.5) which is just the lower bound... likewise with the next test (fixed_minimum_not_accessible) where lm4 just returns the lower bound.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checked and lsqnonlin does find the lowest cost val here at (0,0)
I think the problem could be solved by moving slightly away from the boundary edge when we reset invalid parameters outside bounds?

Copy link
Collaborator Author

@RichardWaiteSTFC RichardWaiteSTFC Nov 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeh if you change to have the initial guess 0.1 inside the lb like so
'lb', [-1.1, -1.1]
This test passes for lm4 minimiser

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just had a play and it still seems flaky. For some bounds it works and for others it doesn't... I wonder if we have to transform the diff_step too?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the diff step has to be taken in the transformed coordinate so as not to go over the bounds - the diff step is relative - i.e. scaled by the magnitude of the scaled/free parameter rather than the bound one which I also think it correct. We can discuss tomorrow!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So with implementing the box start parameters in this gist I could get all the tests to pass with for both simplex and lm4, even with the original bounds in both_bounds_minimum_accessible and with a tighter tolerance for upper_bound_minimum_not_accessible (1e-4 compared to 1e-6 for simplex), but I had to change the default nu_up from 10 to 5 in order to get those two tests to fit with the original bounds and tighter tolerance... not sure if this is ok?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mducle - I'll look at this today - in theory I don't have a problem with changing nu_up=5 as long as it still passes the NIST data sets.
Here is the advice given on wikipedia...

An effective strategy for the control of the damping parameter, called delayed gratification, consists of increasing the parameter by a small amount for each uphill step, and decreasing by a large amount for each downhill step. The idea behind this strategy is to avoid moving downhill too fast in the beginning of optimization, therefore restricting the steps available in future iterations and therefore slowing down convergence.[7] An increase by a factor of 2 and a decrease by a factor of 3 has been shown to be effective in most cases, while for large problems more extreme values can work better, with an increase by a factor of 1.5 and a decrease by a factor of 5.[8]

Which I think (if I'm interpreting it correctly) would correspond to nu_up=2, nu_dn=0.33 (similar value to existing default value for nu_dn). I will try nu_up = 10 (existing default), 5, 2 and see how they compare in terms of the number of iterations as well as accuracy. FYI I got my value of nu_up from numerical recipes (or something based on numerical recipes...).

Copy link
Collaborator Author

@RichardWaiteSTFC RichardWaiteSTFC Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I quickly tried this on the Gauss3 NIST dataset (so-called average difficulty with 8 parameters) and get these results (note a negative dcost indicates the chi-squared value we found is larger than the NIST provided value - but as you can see only by a very small amount, this could probably be reduced by changing the default convergence criteria!)

% minimising residual array
% nu_up   nu_dn   iterations  nhess_evals     dcost(%)
% 10        0.3      11          10             -1.1e-9
% 5         0.3       9           8             -1.1e-9
% 2         0.3      11           8             -1.1e-9
% 1.5       0.2      15           8             -1.1e-9

% minimising scalar
% nu_up   nu_dn   iterations  nhess_evals     dcost(%)
% 10        0.3      12         10            -1.2e-7
% 5         0.3      17          8            -6.5e-8
% 2         0.3      16         10            -1.2e-7
% 1.5       0.2      21          9            -1.2e-7

image
I will try on a more difficult dataset...

Copy link
Collaborator Author

@RichardWaiteSTFC RichardWaiteSTFC Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried fitting the Eckerle4 NIST dataset (high difficulty - 3 parameters) - note lm3 can't fit this
Here are the results

% minimising residual array
% nu_up   nu_dn   iterations  nhess_evals     dcost(%)
% 10        0.3      121          81          -5.7e-5
% 5         0.3       64          39          -1.6e-5
% 2         0.3      136          54          -2.3e-7
% 1.5       0.2      166          38          -3.1e-5

% minimising scalar
% nu_up   nu_dn   iterations  nhess_evals     dcost(%)
% 10        0.3      55           38          -0.7
% 5         0.3      47           29          -0.7
% 2         0.3      76           17          -2e4  % very bad fit!
% 1.5       0.2      74           19          -0.7

image

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like nu_up=5, nu_dn=0.3 is a bit of a sweet spot! Albeit having not given it the rigorous fitbenchmarking treatment...

swfiles/+ndbase/lm4.m Outdated Show resolved Hide resolved
swfiles/+ndbase/lm4.m Outdated Show resolved Hide resolved
swfiles/+ndbase/lm4.m Outdated Show resolved Hide resolved
stat.func = cost_func_wrap.cost_func;
stat.param = param;
stat.param.Np = cost_func_wrap.get_num_free_parameters();
stat.msg = message;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the docstring this is message - it might be good to change it here to match the docstring...

swfiles/+ndbase/lm4.m Outdated Show resolved Hide resolved
swfiles/+ndbase/lm4.m Outdated Show resolved Hide resolved
@RichardWaiteSTFC
Copy link
Collaborator Author

RichardWaiteSTFC commented Nov 12, 2024

Thanks for the review @mducle - sorry for the long PR, it probably should have been 2 as you say!

I've been on unscheduled leave most of the last two days so as didn't get a chance to give final polish - in particular I didn't get a chance to change docstring (taken from lm3) as I had planned as discussed at stand-up!

Some of the unit tests (especially for bounded problems) are disabled for lm4 but work for simplex - is this a weakness of lm4? Should we note in the docstring that bounded fitting with lm4 is a bit flaky and maybe suggest users to specify larger bounds?

Yes there are 2 tests disabled for the lm4 minimiser - it fails to find the minimum within the constraints (note for these tests the true minima of the cost function is not accessible)! The problem is indeed due to the bounds, if the initial guess is too close to one of the bounds (which happens automatically if the initial parameters are outside the bounds) the lm4 minimiser seems to 'converge' very prematurely (sometimes after only 1 iteration) for one of the change in cost function or parameter step criteria the algorithm. These can be fiddled with (along with the step size used) but it is still flaky - I think this problem is perhaps inherent to any method using finite differences - it would be interesting to compare to lmfit.

I will add a warning when the parameter is reset at the boundary (as lm4 users should want to avoid this).

I also wondered whether a different parameter transformation may be better? But it is worth noting that it is enabled for the majority of tests with bound parameters!

@RichardWaiteSTFC
Copy link
Collaborator Author

RichardWaiteSTFC commented Nov 15, 2024

Since the last review:

  • Have changed default nu_dn=5 in lm4
  • cost_function_wrapper (and minimisers which use it - i.e. simplex and lm4) will reset parameters out of bounds provided using same method as lsqnonlin (previously would reset to be nearest boundary, but this didn't work well for the LM method as discussed). A warning is now printed when this has been done.
    • Had to use a slightly different method to your gist sorry, was some tricky behaviour with setting parameters correctly if fixed (as these are cached in wrapper class)
  • Enabled all the unit tests for the lm4 optimiser (and tightened some tolerances).
  • Add option vary to simplex and lm4 (lm,lm2 and lm3 all have this option) to specify which parameters to vary in the fit.
  • Updated doc-strings (hopefully giving user some idea of what the LM parameters do...at least as I understand them!)

Copy link
Member

@mducle mducle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just a couple of typos in the docstring.

swfiles/+ndbase/lm4.m Outdated Show resolved Hide resolved
swfiles/+ndbase/lm4.m Outdated Show resolved Hide resolved
@RichardWaiteSTFC RichardWaiteSTFC merged commit 7e87dee into master Nov 18, 2024
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants